Variance Reduction in Simulations of Loss Models

Authors

  • Rayadurgam Srikant
  • Ward Whitt
Abstract

We propose a new estimator of steady-state blocking probabilities for simulations of stochastic loss models that can be much more efficient than the natural estimator (ratio of losses to arrivals). The proposed estimator is a convex combination of the natural estimator and an indirect estimator based on the average number of customers in service, obtained from Little's law (L = λW), which exploits the known offered load (product of the arrival rate and the mean service time). The variance reduction is often dramatic when the blocking probability is high and the service times are highly variable. The advantage of the combination estimator in this regime is partly due to the indirect estimator, which itself is much more efficient than the natural estimator in this regime, and partly due to strong negative correlation between the natural and indirect estimators. In general, when the variances of two component estimators are very unequal, the variance reduction from the optimal convex combination is about 1 − ρ², where ρ is the correlation between the component estimators. For loss models, the variances of the natural and indirect estimators are very unequal under both light and heavy loads. The combination estimator is effective for estimating multiple blocking probabilities in loss networks with multiple traffic classes, some of which are in normal loading while others are in light and heavy loading, because the combination estimator does at least as well as either component estimator, and provides improvement as well.

Subject classifications: Simulation, efficiency: variance reduction for estimates of blocking probabilities; Queues, simulation: efficient simulation estimators for loss models; Communications: efficient simulation of loss networks.
Area of Review: Simulation.

This paper concerns variance reduction in the estimation of blocking probabilities in simulations of stochastic loss models. A stochastic loss model has one or more arrival processes, modeled as stochastic processes, and has the property that not all of these arrivals are admitted. We are interested in a long-run-average or steady-state blocking probability, i.e., the long-run proportion of arrivals from one arrival process that are not admitted. The mathematical model is quite general; we assume that admitted arrivals each eventually spend some random time in service, possibly after waiting, and then depart. Otherwise, we only assume appropriate long-run averages exist; see (1)–(5) below. The allowed model generality means that the model can be a complex loss network or resource-sharing model, perhaps with alternative routing, such as a model of a communication network; see Ross (1995). Moreover, there are no Markov or independence assumptions, so that it is possible to model complex traffic processes.

Simulations of large complex loss networks can be very time consuming, often requiring hours or more. Thus, effective variance reduction methods can be very useful. We propose an easily implemented estimator for blocking probabilities that can be remarkably efficient compared to the natural estimator (ratio of losses to arrivals). By "efficient" we mean low variance for a given run length or, equivalently, short run length for a given variance. The new estimator is a convex combination of the natural estimator and an indirect estimator based on the average number of customers in service, obtained from Little's law (L = λW).
It turns out that the improvement over the natural estimator provided by the proposed method is especially dramatic when the holding times are highly variable and the blocking probability is relatively high. This is a practically important case for communication networks because multiple services (e.g., voice and computer lines) lead to highly variable holding times, and interest in system response to failures leads to considering scenarios with relatively high blocking probabilities. Of course, the response to short-lived failures requires transient analysis, but since serious link failures in telecommunications networks, such as are caused by backhoe accidents, persist for a substantial time compared to call holding times, there is serious interest in the steady-state behavior in the presence of failures. Since continued reliable service is desired, effort is made to provide satisfactory service even in the presence of failures. Hence, simulation experiments are frequently conducted to estimate steady-state blocking probabilities under relatively heavy loads.

The proposed procedure is also effective for complex loss networks with multiple traffic classes, some of which are in normal loading while others are in light and heavy loading. The new combination estimator tends to be close to the appropriate component estimator depending on the loading, and provides improvement as well. The combination estimator would be useful even if it only selected the better component estimator, because their efficiency differs dramatically in light and heavy loading. There is a substantial literature on variance reduction, as can be seen from Chapters 2 and 8 of Bratley, Fox and Schrage (1987). Fleming, Schaeffer and Simon (1995) also treat a class of loss models and achieve spectacular variance reduction in many cases by combining control variates and importance sampling.

1. Alternative Estimators

We consider a general system to which arrivals come according to some stochastic process {A(t) : t ≥ 0}, i.e., A(t) records the number of arrivals in the interval [0, t]. Some of these arrivals are admitted to the system, after which they stay for a random time and then depart, while other arrivals are blocked and lost. Let {L(t) : t ≥ 0} be the stochastic process representing losses, i.e., L(t) is the number of losses in the interval [0, t]. Admitted customers may initially wait before beginning service, but they eventually enter service and then depart. Let {X_n : n ≥ 1} be the successive service times of the admitted calls. Let N(t) and W(t) represent the number of customers in service and waiting, respectively, at time t. We make no detailed stochastic modeling assumptions, such as independence or Markov assumptions. We only assume that

    t^{-1} A(t) → λ as t → ∞,   (1)

    L(t)/A(t) → B as t → ∞,   (2)

    (X_1 + ... + X_n)/n → μ^{-1} as n → ∞,   (3)

    n̂(t) ≡ t^{-1} ∫_0^t N(u) du → n as t → ∞   (4)

and

    W(t)/t → 0 as t → ∞,   (5)

all with probability 1 (w.p.1). Equations (1) and (2) together imply that L(t)/t → λB as t → ∞ w.p.1 as well. The limits λ, B, μ^{-1} and n in (1), (2), (3) and (4) are the arrival rate, the (long-run-average or steady-state) blocking probability, the mean service time and the long-run-average or steady-state number of customers in service, respectively. Condition (5) implies that the long-run rate of customers entering service equals the long-run rate of admitted customers, λ(1 − B). Condition (5) is clearly satisfied by the classical G/G/s/k model with s servers and k (finite) extra waiting spaces, but it is also satisfied for other models.
For example, the number of available servers could be random. There need not even be separate identifiable servers for each customer. Alternatively, service might be completed in several stages at separate facilities. The point is that the framework provided by equations (1)–(5) is very general, so that the proposed estimation procedure is widely applicable. Of course, wide applicability does not imply that the proposed estimation procedure is necessarily effective in reducing variance. However, it is our experience that the method is indeed effective for many parameter settings in many models.

In this setting our object is to estimate the blocking probability B by simulation. The natural estimator is

    B̂_N(t) = L(t)/A(t).   (6)

By (2), B is the limit of B̂_N(t) as t → ∞, so that the natural estimator is consistent. However, since the natural estimator is the ratio of two random quantities, it is a ratio estimator. Ratio estimators have some complications; e.g., in general they are biased: If the processes {L(t) : t ≥ 0} and {A(t) : t ≥ 0} have stationary increments, then EL(t)/EA(t) = B for each t, but in general EB̂_N(t) ≠ B.

An estimator closely related to the natural estimator, which we call the simple estimator, is

    B̂_S(t) = L(t)/λt,   (7)

where λ is the arrival rate in (1). Assuming that the process {A(t) : t ≥ 0} has stationary increments, λ = EA(1). Assuming that the process {L(t) : t ≥ 0} has stationary increments, the simple estimator B̂_S(t) is unbiased for each t: EB̂_S(t) = EL(1)/λ = B. Thus, the simple estimator might seem preferable to the natural estimator, but in Srikant and Whitt (1996) we showed, through examples and theory (Section 7 of that paper), that the simple and natural estimators tend to be nearly identical for large samples (in actual value as well as in distribution). We discuss the simple estimator somewhat more here in Section 6.

In this paper we propose an alternative estimator that in some circumstances has significantly lower variance and is nearly as easy to construct. Our starting point is the indirect estimator

    B̂_I(t) = 1 − n̂(t)/α,   (8)

where α = λ/μ is the offered load and n̂(t) is as in (4). The indirect estimator B̂_I(t) requires that we know the parameters λ and μ^{-1}, which is usually the case in simulations. (There are exceptions. For example, we would not know λ if the arrival process of interest itself comes from overflows from another system with unknown blocking probability. This presumes that we are interested in the proportion of these overflows that are subsequently blocked. We would not know μ^{-1} if the service time included some unknown random waiting time.) The indirect estimator also requires that we record the statistic n̂(t), which is usually not difficult to do. The indirect estimator is obtained from Little's law (L = λW); if λ, B, μ^{-1} and n are the limits in (1)–(4), then the relation L = λW applied to the service facility (but not the waiting room if there is any) yields λ(1 − B)μ^{-1} = n or, equivalently, B = 1 − (n/α), from which we obtain (8); see Whitt (1991, 1992a). Indirect estimation of queueing quantities by Little's law was studied by Law (1975), Carson and Law (1980) and Glynn and Whitt (1989), but they did not focus on loss models. Srikant and Whitt (1996) studied the performance of the estimators B̂_I(t) and B̂_N(t), and showed that B̂_I(t) tends to be much more (less) efficient than B̂_N(t) in heavy (light) loading.
The advantage of B̂_I(t) over B̂_N(t) in heavy loading is much more dramatic than the previous results for indirect estimators for delay models; e.g., the variance reduction might be by a factor of 1000 or more (e.g., see the +6.0 case in Table 1 of Srikant and Whitt). The potential advantage of the indirect estimator in heavy loads should be evident. In heavy loads, the average number of customers in service typically tends to be nearly constant; e.g., it approaches s in the G/G/s/k model as the load increases. Thus, the variance of the indirect estimator typically decreases to 0 as the arrival rate increases. However, the variance of the natural estimator also approaches 0 if the arrival rate increases for a fixed run length. Srikant and Whitt showed that the ratio of the variances, Var B̂_I(t)/Var B̂_N(t), goes to 0 as the arrival rate increases.

Our proposed estimator is the combination estimator

    B̂_C(t) = p B̂_N(t) + (1 − p) B̂_I(t),   (9)

where p is appropriately chosen to reduce variance (see Section 2). The idea behind the combination estimator in (9) is the observation that B̂_I(t) is decreasing in n̂(t), while B̂_N(t) should tend to be increasing in n̂(t), so that B̂_I(t) and B̂_N(t) should be negatively correlated. We prove a supporting covariance inequality for a class of GI/GI/s/0 models (having s servers, no extra waiting room and independent sequences of i.i.d. interarrival times and holding times) in Section 8, but the ordering is intuitively reasonable in general.

Obtaining variance reduction by indirect estimation and by combining different estimators are not new ideas. Indeed, the general idea that variance can be reduced by combining different estimators as in (9) is well known, e.g., see p. 63 of Bratley, Fox and Schrage (1987), but it evidently was not apparent that the combination estimator in (9) could provide truly significant improvement for loss models. It can, as is demonstrated by our examples in Section 4. For instance, in the best case in our examples of GI/GI/s/0 models with s = 100 the variance ratio is Var B̂_N(t)/Var B̂_C(t) ≈ 1800 (see Table 1). Only part of this benefit would be achieved by the indirect estimator alone; in this case Var B̂_N(t)/Var B̂_I(t) ≈ 200. The variance ratio of 1800 means that the run length for the combined estimator B̂_C(t) could be about 1800 times shorter than the run length for the natural estimator B̂_N(t) in order to produce the same statistical precision. This variance reduction would reduce a 30 minute run to less than 1 second.

We show that the variance reduction provided by the indirect and combination estimators is even greater when we add a finite waiting room. If a waiting room of size 100 is added to the GI/GI/s/0 model with s = 100, then the variance reduction in the best case jumps from 10³ to 10⁶ or more; see Section 4.4. The advantage of the waiting room should also be evident, because then the mean occupancy n̂(t) is even less variable. (Recall that n̂(t) is the time-average number of customers in service, not the number of customers in the system.)

However, it turns out that the benefit of the combination estimator is not uniform in the model parameters. The combination estimator tends to provide dramatic improvement under heavy loads, significant improvement under normal loads, and moderate improvement under light loads. We show that the performance of the combination estimator can be explained by the variance ratio r² ≡ Var B̂_I(t)/Var B̂_N(t) and the correlation ρ ≡ Corr(B̂_I(t), B̂_N(t)).
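To make these definitions concrete, here is a minimal sketch (our illustration, not the authors' code) of how the estimators in (6), (8) and (9) could be computed from quantities tallied during a simulation run; the function name and arguments are hypothetical, and the weight p is assumed to be supplied by the user (its choice is discussed in Sections 2 and 3).

    def blocking_estimators(arrivals, losses, area_in_service, run_length,
                            arrival_rate, mean_service_time, p):
        """Compute the natural, indirect and combination estimators of the
        blocking probability B from simulation tallies (a sketch only).

        arrivals          -- A(t), number of arrivals in [0, t]
        losses            -- L(t), number of blocked arrivals in [0, t]
        area_in_service   -- integral of N(u) du over [0, t]
        run_length        -- t
        arrival_rate      -- lambda (assumed known)
        mean_service_time -- 1/mu (assumed known)
        p                 -- weight on the natural estimator in (9)
        """
        offered_load = arrival_rate * mean_service_time        # alpha = lambda/mu
        b_natural = losses / arrivals                           # (6)
        n_hat = area_in_service / run_length                    # (4)
        b_indirect = 1.0 - n_hat / offered_load                 # (8)
        b_combination = p * b_natural + (1.0 - p) * b_indirect  # (9)
        return b_natural, b_indirect, b_combination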
In Section 2 we show that, in general, the variance reduction of a combination estimator is about 1 − ρ² when the variance ratio r² is either very large or very small. As shown by Srikant and Whitt, the variance ratio r² tends to be very large under light loads and very small under heavy loads. Loss model examples show that the correlation ρ tends to be quite strongly negative under all loadings, but especially under heavy loads (e.g., see Table 1).

As shown for B̂_I(t) by Glynn and Whitt, a key ingredient in the proposed estimator B̂_C(t) is exploiting the known parameters λ and μ^{-1}. However, there are other ways to take advantage of this knowledge, in particular, through linear control estimators. Thus, we also consider linear control estimators, using estimators of the arrival rate and mean service time as control variables. (Glynn and Whitt (1989) show that from the perspective of asymptotic efficiency it suffices to consider linear control estimators in the class of suitably smooth nonlinear control estimators.) For this purpose, let

    λ̂(t) = t^{-1} A(t)   (10)

and

    μ̂^{-1}(t) = (1/D(t)) Σ_{i=1}^{D(t)} X_i,   (11)

where X_i is the service time of the ith customer to complete service and D(t) is the number of departures (of admitted customers after receiving service) in [0, t]. Linear control estimators can be considered with respect to each of the estimators B̂_N(t), B̂_I(t) and B̂_C(t). One is

    B̂_LN(t) = B̂_N(t) + a_1(λ̂(t) − λ) + a_2(μ̂^{-1}(t) − μ^{-1}),   (12)

where a_1 and a_2 are chosen appropriately. The corresponding linear control estimator constructed from B̂_I(t) is denoted B̂_LI(t). The grand combination estimator is

    B̂_GC(t) = B̂_C(t) + b_1(λ̂(t) − λ) + b_2(μ̂^{-1}(t) − μ^{-1})
             = p B̂_N(t) + (1 − p) B̂_I(t) + b_1(λ̂(t) − λ) + b_2(μ̂^{-1}(t) − μ^{-1}),   (13)

where the three parameters p, b_1 and b_2 are chosen appropriately. The grand combination estimator B̂_GC(t) in (13) (with the best parameters) clearly should be most efficient overall, and that is our experience. However, we find that the combination estimator B̂_C(t) in (9) consistently performs nearly as well as the grand combination estimator B̂_GC(t) in (13), so that it should suffice to use the more elementary combination estimator.

Our examples show that linear control estimators can also significantly reduce variance. The variance reduction for estimates of blocking probabilities tends to be greater than the variance reduction for standard single-server queues using similar control variates; see Lavenberg, Moeller and Welch (1982). However, the combination estimator B̂_C(t) consistently does at least as well as, and in some cases does significantly better than, the linear control estimators B̂_LN(t) and B̂_LI(t).

It is well known that the blocking probabilities in the M/GI/s/0 model (with Poisson arrival process) are insensitive to the general holding-time distribution beyond its mean; e.g., see p. 271 of Wolff (1989). However, in Srikant and Whitt we showed that the variances of the estimators B̂_N(t) and B̂_I(t) do not have this insensitivity property. Indeed, for the M/GI/s/0 model these variances tend to be proportional to 1 + c²_s, where c²_s is the squared coefficient of variation (SCV, variance divided by the square of the mean) of the holding-time distribution. In contrast, the variance of the new combination estimator B̂_C(t) tends to be nearly insensitive to the holding-time distribution beyond its mean; see Sections 4.1, 4.2 and 5. This partly explains the effectiveness of the combination estimator.
In our previous paper we developed predictions for the variance of the estimators B̂_N(t) and B̂_I(t) in the G/G/s/0 model, to be used before any data have been collected. We have yet to develop such predictions for the new estimators proposed here. We only know that the variance should be less than the minimum of the variances of B̂_N(t) and B̂_I(t) for the G/G/s/0 model. Hence, the previous predictions can yield upper bounds for G/G/s/0 models.

Our previous paper focused on the computational effort required to achieve a given statistical precision with the basic estimators. We remark that the story for loss models (G/G/s/0) is quite different from the story for delay models (G/G/s/∞); see Whitt (1989). In particular, for loss models there is no precipitous rise as the traffic intensity approaches 1. Indeed, the case in which the traffic intensity is 1 is called normal loading. Figure 1 of Srikant and Whitt shows that the computational effort to achieve a given statistical precision (using a criterion of absolute error) increases with the offered load for the natural estimator. However, Figure 2 of Srikant and Whitt shows that the computational effort decreases with the offered load for the indirect estimator. With the combination estimator, normal loading (the middle) requires the most computational effort. It is good, then, that the combination estimator provides significant variance reduction there.

The methods here would be broadly applicable to estimate blocking probabilities from real-time measurements of actual loss systems, provided that we could estimate the offered load too during the measurement process. Hence, it is also natural to consider the modified indirect estimator

    B̂_M(t) = 1 − n̂(t)/α̂(t),   (14)

where

    α̂(t) = λ̂(t) μ̂^{-1}(t),   (15)

and the associated modified combination estimator B̂_MC(t), defined as in (9) with B̂_M(t) in place of B̂_I(t). Unfortunately, however, we found that these modified estimators do not provide significant improvement. The variance ratio Var B̂_N(t)/Var B̂_MC(t) in our examples was consistently about 1. Hence we do not display results for these estimators. In Section 6 we explain the importance of knowing λ and μ when we use the indirect estimator B̂_I(t) or the combination estimator B̂_C(t).

It is of course possible that we could obtain good estimates of λ and μ^{-1} from previous measurements. In a network application we might monitor the system and have available estimates of λ and μ^{-1}. There might then be a failure event, which would make it desirable to estimate blocking probabilities. Assuming that the parameters λ and μ^{-1} are not altered by the failure event, we can use the previous estimates of λ and μ^{-1} in the combination estimator to estimate the blocking probability from measurements after the failure event. We now investigate the general combination variance reduction approach more carefully.

2. Variance Reduction for Combination Estimators

Part of the benefit of the combination estimator B̂_C(t) in heavy loads comes from the indirect estimator B̂_I(t), which Srikant and Whitt (1996) have shown to be significantly more efficient than the natural estimator B̂_N(t) in heavy loads. To understand the two different contributions to efficiency in heavy loads, it is useful to represent the variance ratio as the product of two separate variance ratios, i.e.,

    Var B̂_C(t)/Var B̂_N(t) = [Var B̂_C(t)/Var B̂_I(t)] × [Var B̂_I(t)/Var B̂_N(t)].   (16)

It is interesting to see how the variance ratio Var B̂_C(t)/Var B̂_I(t) is affected by the fact that the variance ratio Var B̂_I(t)/Var B̂_N(t) is quite small.
In this section we show that the variance ratio Var B̂_C(t)/Var B̂_I(t) depends on two key factors: the variance ratio Var B̂_I(t)/Var B̂_N(t) and the correlation Corr(B̂_I(t), B̂_N(t)).

To express the problem generically, let X and Y be random variables with a common mean and let

    Z = pX + (1 − p)Y.   (17)

Let σ²_X = Var X, σ²_Y = Var Y, r = σ_Y/σ_X and ρ = Cov(X, Y)/(σ_X σ_Y). Clearly, the variance ratio is r² and the correlation is ρ. By direct calculation,

    Var Z ≡ V(p) = σ²_Y [p²/r² + (1 − p)² + 2p(1 − p)ρ/r].   (18)

Differentiating, we find that V''(p) > 0 for all p, so that the minimum is found by setting V'(p) = 0. The minimum variance of the combination variable Z is attained at

    p* = r(r − ρ)/(1 + r² − 2ρr)   (19)

and is

    V(p*) = σ²_Y (1 − ρ²)/(1 + r² − 2ρr).   (20)

Note that in general we can have p* < 0 and p* > 1 in (19), but if ρ ≤ 0, then necessarily 0 < p* < 1.

Assume that σ²_Y ≤ σ²_X, so that r ≤ 1. Then we want to compare Var Z to σ²_Y, since Y is more efficient (has lower variance) than X. For this purpose, let the combination variance reduction factor as a function of p be

    R(p) = V(p)/σ²_Y = p²/r² + (1 − p)² + 2p(1 − p)ρ/r   (21)

and let the optimal combination variance reduction factor be

    R(p*) = V(p*)/σ²_Y = (1 − ρ²)/(1 + r² − 2ρr).   (22)

We can use (22) to bound below the possible variance reduction:

    R(p*) ≥ (1 − ρ²)/(1 + r)² ≥ (1 − ρ²)/4   (23)

for r ≤ 1. If r = 1, then R(p*) = (1 + ρ)/2, which is only significant when ρ is suitably close to its lower limit −1. If ρ is indeed close to −1, then the lower bound can be approximated by

    (1 − ρ²)/4 = (1 − ρ)(1 + ρ)/4 ≈ (1 + ρ)/2,

which agrees with what is achieved when r = 1.

We are especially interested in the case of small r. From (22), we see that

    lim_{r→0} R(p*) = 1 − ρ²,   (24)

which, interestingly, is independent of the sign of ρ. Note that the limit of R(p*) as r → 0 differs from the lower bound over all r in (23), which is attained at r = 1, only by a factor of 4. In the case of small r, the variance reduction in (16) is approximately the product of 1 − ρ² and r². The combination estimator helps under heavy loads because ρ is then often quite close to −1.

It is also interesting to see how p* behaves as r → 0. From (19), we see that

    p*(r)/r → −ρ as r → 0,

so that we have p* ≈ −ρr for small r. More generally, if we let p/r → c as r → 0, then

    R(p) → c² + 1 + 2cρ,   (25)

by (21). We can use (25) to see how errors in p* affect the variance reduction. An asymptotic relative error ε in p* corresponds to p/r → c as r → 0 with c = −ρ(1 + ε). Then

    R(p*(1 + ε)) = 1 − ρ² + ε²ρ²,   (26)

so that an asymptotic relative error ε in p* yields an absolute loss of variance reduction (increase in R) of ε²ρ², which is less than ε². Hence, for small r, an ε relative error in p* will have negligible impact if ε² is suitably small compared to 1 − ρ².

If we do not know ρ, but we know r, then we could let p = r. From (21),

    R(r) = 1 + (1 − r)² + 2(1 − r)ρ
         = 1 + (1 − r)² − 2(1 − r) + 2(1 − r)(1 + ρ)
         = r² + 2(1 − r)(1 + ρ) ≈ r² + 1 − ρ²,   (27)

which is not too different from 1 − ρ² when r is sufficiently small. Indeed, if r² ≈ 1 − ρ², then the variance reduction in (16) is approximately r⁴, i.e., each step then contributes equally and the overall reduction is the one-step reduction squared.
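As a quick numerical illustration of (19)–(24) (a sketch, not from the paper), the optimal weight and the achieved variance-reduction factor can be computed directly from the two component variances and their correlation; the numerical values below are assumptions chosen only to mimic a heavy-loading case with small r and strongly negative ρ.

    import math

    def optimal_combination(var_x, var_y, rho):
        """Return (p_star, R(p_star)) for Z = p*X + (1-p)*Y as in (17)-(22).

        var_x, var_y -- variances of the component estimators X and Y
        rho          -- correlation between X and Y
        """
        r = math.sqrt(var_y / var_x)                                 # r = sigma_Y / sigma_X
        p_star = r * (r - rho) / (1.0 + r * r - 2.0 * rho * r)       # (19)
        reduction = (1.0 - rho * rho) / (1.0 + r * r - 2.0 * rho * r)  # (22)
        return p_star, reduction

    # Hypothetical heavy-loading values: the indirect estimator has much smaller
    # variance (small r) and the components are strongly negatively correlated.
    p_star, R = optimal_combination(var_x=1.0, var_y=0.01, rho=-0.9)
    print(p_star, R)   # p* is about 0.084, near -rho*r = 0.09; R is about 0.16,
                       # near the small-r limit 1 - rho^2 = 0.19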
3. Estimation Procedures

There are a variety of ways to implement the estimation procedures presented so far. What is appropriate depends on the specific model. We now describe what we have done for the models considered here (in Section 4).

3.1. Run Length and Initial Conditions

We try to avoid most serious statistical problems by having relatively long runs. For the M/M/s/0 model with s = 100 and service rate 1, we let the measurement interval be 10⁴. When the arrival rate is λ = 100, this means that the expected number of arrivals during the run is 10⁶. Since the steady-state blocking probability is then about 0.07, the expected number of losses is 7 × 10⁴. In the M/M/s/0 model and the more general GI/M/s/0 model (with renewal arrival process), losses are regeneration points, so that segments between successive losses are i.i.d. Since B^{-1} is one plus the expected number of arrivals between successive losses, B^{-1} could be estimated in this framework by the sample mean of 7 × 10⁴ i.i.d. random variables. We do not actually use this estimation procedure and we do not restrict attention to GI/M/s/0 models, but this analysis shows that the sample size is indeed quite large. We do not discuss the issue of required path length for loss models at length here, because we already did so in Srikant and Whitt.

We start each run with an empty system. Since that initial condition introduces bias, we have a warmup period, i.e., we wait a fixed time before collecting any data. (The full run begins after the warmup period.) As shown in Section 11 of Srikant and Whitt, the warmup period for loss models often need not be extraordinarily long to make the initial bias negligible. For the M/M/s model, we let the warmup period be 50, which corresponds to 50 mean service times. This is more than adequate for the M/M/s/0 model (then about 5 is adequate), but is appropriate for the more variable hyperexponential service times that we also consider in some of our examples. We note that finite-capacity models tend to require shorter warmup periods than infinite-capacity models, because the maximum number of customers that can be in the finite-capacity system is constrained. In an infinite-capacity system a longer time is required to reach levels that are captured by the tail of the steady-state content distribution.

As indicated in Section 11 of Srikant and Whitt, the selection of a warmup period can be aided by considering the behavior of associated infinite-server models. In the M/G/∞ model with a holding-time cdf G having mean 1, the time-dependent number of busy servers starting empty has a Poisson distribution with mean

    EN(t) = n(1 − ∫_t^∞ G^c(u) du),   (28)

where n is the steady-state mean; see (21) on p. 740 of Eick, Massey and Whitt (1993). (In (73) of Srikant and Whitt, En̂(t) should be replaced by EN(t), or (73) should be

    En̂(t) − n = −(n/t) ∫_0^t H^c_e(u) du,

where H_e is the stationary-excess service-time cdf there.) Since the Poisson distribution is fully characterized by its mean, it is reasonable to measure the time to approach steady state in terms of the time for the mean to approach within a proportion ε of its steady-state mean. From (28),

    (n − EN(t))/n = ε   (29)

if and only if

    ∫_t^∞ G^c(u) du = ε.   (30)

Equation (30) leads us to choose a warmup period of 5 in the M/M/s/0 model. (Then the integral reduces to e^{−t}.) It is significant that equation (28) remains valid in the much more general G/GI/∞ model, see Remark 2.3 of Massey and Whitt (1993), so that it is reasonable to use (30) more generally. However, the Poisson distribution property is lost when the arrival process is not required to be Poisson. Thus the full distribution may not be close to the steady-state distribution when the means are close. Nevertheless, (30) seems like a useful practical criterion.
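For concreteness, here is a small sketch (our illustration, not the authors' code) that applies criterion (30) numerically: it finds the smallest t with ∫_t^∞ G^c(u) du ≤ ε for a mean-1 service-time distribution. The tolerance ε = 0.005 and the hyperexponential parameters below are assumptions chosen only to mirror the examples in Section 4.1.

    import math

    def warmup_time(tail_integral, eps=0.005, t_hi=1000.0):
        """Smallest t with tail_integral(t) <= eps, found by bisection -- criterion (30)."""
        lo, hi = 0.0, t_hi
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if tail_integral(mid) <= eps:
                hi = mid
            else:
                lo = mid
        return hi

    # Exponential (M) service, mean 1: integral_t^inf e^{-u} du = e^{-t}.
    exp_tail = lambda t: math.exp(-t)

    # Balanced-means H2 service with mean 1 and SCV 10, parameters as in (36)-(38).
    p = 0.5 * (1.0 + math.sqrt(9.0 / 11.0))
    lam1, lam2 = 2.0 * p, 2.0 * (1.0 - p)        # so p/lam1 = (1-p)/lam2 = 1/2
    h2_tail = lambda t: (p / lam1) * math.exp(-lam1 * t) + ((1.0 - p) / lam2) * math.exp(-lam2 * t)

    print(warmup_time(exp_tail))   # about 5.3, consistent with a warmup of about 5
    print(warmup_time(h2_tail))    # roughly 48, consistent with the warmup of 50 used above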
3.2. Estimating Variances and Covariances

To estimate variances and covariances, we use simple batch means; i.e., we divide the total run (after the warmup period) into n nonoverlapping batches of equal length and construct batch means. Since the runs are relatively long, there tends to be negligible correlation between different batches. Since there are about 10⁴ regeneration points within each run in the GI/M/s/0 model, it is evident that the batches should be very nearly independent in those cases. It would also be possible to use other procedures, such as overlapping batch means or weighted batch means; see Meketon and Schmeiser (1984) and Bischak, Kelton and Pollock (1993). Our variance reduction technique does not require that we use simple batch means.

Given that we do use simple batch means, we estimate the covariance Cov(X, Y) for random variables X and Y by

    Ĉ(X, Y) = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ),   (31)

where (X_i, Y_i) are the batch means from the ith batch and (X̄, Ȳ) are the averages of the batch means. The variance estimate V̂(X) is Ĉ(X, X). If we want the variance and covariance estimates themselves to have lower variance, in addition to making longer runs, we need to let the number of batches grow as we increase the run length; see Glynn and Whitt (1991). (Alternatively, we could make independent replications.)

As described in Glynn and Whitt (1991), the standard deviations of the variance estimators V̂(B̂_N(t)) and V̂(B̂_I(t)) are about √(2/(n − 1)) times their means. To derive this relation, we assume that the usual asymptotic normality for estimators as the run length grows is valid. Under that condition, if the run is sufficiently long, then for the estimators B̂_N(t) and B̂_I(t) the batch means will be approximately distributed as n i.i.d. normally distributed random variables, each with mean B and variance nσ²/T, where σ² is the asymptotic variance (σ²_N or σ²_I) and T is the total run length. Then the sample variance V̂(X) is approximately distributed as nσ²χ²_{n−1}/T(n − 1), where χ²_{n−1} is a chi-square random variable with n − 1 degrees of freedom. The random variable χ²_{n−1} has mean n − 1 and variance 2(n − 1), so that the sample variance V̂(X) has mean nσ²/T (equal to the variance of a batch mean) and variance 2n²σ⁴/T²(n − 1). Hence, the standard deviation of V̂(X) is indeed approximately √(2/(n − 1)) times its mean. For n = 20, SD V̂(X) ≈ EV̂(X)/3. This analysis shows how much statistical precision we can expect from the variance estimators. Obviously we can reduce the standard deviation of the variance estimator if we increase the number of batches. However, the analysis only remains correct if the batches remain approximately independent.

3.3. Estimating p

We estimate p* by using formula (19) with the estimates of r and ρ, i.e.,

    p̂ = r̂(r̂ − ρ̂)/(1 + r̂² − 2ρ̂r̂),   (32)

where

    r̂² = V̂(Y)/V̂(X) and ρ̂ = Ĉ(X, Y)/√(V̂(X)V̂(Y)).   (33)

By the same argument used to establish (19), p̂ is the value of p minimizing the sample variance

    V̂(p) ≡ (1/(n − 1)) Σ_{i=1}^{n} [pX_i + (1 − p)Y_i − (pX̄ + (1 − p)Ȳ)]².   (34)

Hence, p̂ can also be found by computing V̂(p) and searching for the minimizing p. To avoid bias in this step, we should estimate p using a separate run, but we do the estimation of p using the same run in which we estimate B. This procedure clearly induces some underestimation of the variances. In general, it is important to be aware of this possibility, but in our context we found the effect to be minor. To reach this conclusion, we tested the procedure by performing multiple independent replications.
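Before turning to the validation experiments, here is a minimal sketch (under our own assumptions, not the authors' code) of the same-run procedure of Sections 3.2–3.3: given per-batch values of the natural and indirect estimators, it forms the batch-means covariance estimates (31), the quantities r̂ and ρ̂ in (33), the weight p̂ in (32), and the resulting combination estimate with its estimated variance.

    import numpy as np

    def combine_from_batches(x_batches, y_batches):
        """Estimate the combination weight and blocking probability from batch means.

        x_batches -- per-batch values of the natural estimator B_N
        y_batches -- per-batch values of the indirect estimator B_I
        """
        x = np.asarray(x_batches, dtype=float)
        y = np.asarray(y_batches, dtype=float)
        cov = np.cov(x, y, ddof=1)                  # (31): sample variances and covariance
        var_x, var_y, cov_xy = cov[0, 0], cov[1, 1], cov[0, 1]
        r = np.sqrt(var_y / var_x)
        rho = cov_xy / np.sqrt(var_x * var_y)       # (33)
        p_hat = r * (r - rho) / (1.0 + r * r - 2.0 * rho * r)   # (32)
        b_hat = p_hat * x.mean() + (1.0 - p_hat) * y.mean()     # combination estimate
        var_b = (p_hat ** 2 * var_x + (1.0 - p_hat) ** 2 * var_y
                 + 2.0 * p_hat * (1.0 - p_hat) * cov_xy) / len(x)  # variance of the mean
        return p_hat, b_hat, var_b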
We found that the estimates of p from any of several different runs produced similar variance reduction. Moreover, the fluctuation in variance estimates typically was greater between runs than within one run over the various optimal p values. We will illustrate this phenomenon later. To further support estimating p within the same run in which we estimate B, σ²_X and σ²_Y, we show that the procedure tends to be asymptotically correct as the sample size, say t, increases, provided the number of batches increases with t. Now X and Y in (17) should be replaced by stochastic processes X(t) and Y(t) (e.g., they might be sample means). In great generality, Var X(t) → 0 and Var Y(t) → 0 as t → ∞, but tVar X(t) → σ²_X, tVar Y(t) → σ²_Y and tCov(X(t), Y(t)) → ρσ_Xσ_Y as t → ∞, so that Var Y(t)/Var X(t) → r² and Corr(X(t), Y(t)) → ρ as t → ∞. Under these limits, p̂(t) → p* and R(p̂(t)) → R(p*) as t → ∞. In a specific application we have a fixed small r. By the analysis in (25)–(27), we need to ensure that the error in ρ̂r̂ is then suitably small compared to r. If r is extraordinarily small, this step could be difficult, but then σ²_Y itself should be small.

3.4. Linear Control Variates

The standard theory of linear control variates implies that the optimal value of a_1 in the linear control estimator (12) is

    a*_1 = −Cov(B̂_N(t), λ̂(t))/Var(λ̂(t))   (35)

and similarly for the others, e.g., see p. 96 of Glynn and Whitt (1989) and references cited there. The variance reduction (ratio of new to old variance) provided by using the optimal linear control is 1 − ρ², where ρ is the correlation between the original estimator and the control. We obtain our linear control estimators by estimating a*_1 in (35), estimating the quantities in the numerator and denominator. In the GI/GI/s/0 model the interarrival times and service times are independent, so that it suffices to treat the two controls separately. For the grand combination estimator B̂_GC(t) in (13), the variance evidently is not in general a convex function of the parameters (p, b_1, b_2). Hence, we found the optimal values of b_1 and b_2 for each of a set of p values and then optimized over p, again all within one run. This was easily done, requiring negligible computation time, for p values from 0 to 1 increasing by 0.01.

4. Simulation Experiments

We will illustrate how the variance reduction procedures perform by considering several examples.

4.1. The GI/GI/s/0 Model

We first consider the standard s-server loss model having no extra waiting space and i.i.d. service times that are independent of i.i.d. interarrival times. We first let s = 100 and μ = 1. We consider three values of λ: λ = 140 (heavy loading), λ = 100 (normal loading) and λ = 80 (light loading). We do simulation experiments for these three cases using exponential (M) and hyperexponential (H2, mixture of two exponentials) distributions for the interarrival times and service times. The exponential distribution has squared coefficient of variation (SCV, variance divided by the square of the mean) 1, while the H2 distribution we consider has SCV 10. We let c²_a and c²_s denote the SCVs of the interarrival times and service times, respectively. Our H2 distribution has "balanced means," i.e., it has density

    f(x) = pλ_1e^{−λ_1x} + (1 − p)λ_2e^{−λ_2x}, x ≥ 0,   (36)

with pλ_1^{-1} = (1 − p)λ_2^{-1}. The other two parameters are determined by the mean m and the SCV c². In particular,

    p = [1 + √((c² − 1)/(c² + 1))]/2   (37)

and

    pλ_1^{-1} = (1 − p)λ_2^{-1} = m/2.   (38)
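As an illustration of the balanced-means parameterization (36)–(38) (our sketch, not part of the paper), the following draws H2 holding times with a given mean and SCV; the function name is hypothetical.

    import random

    def sample_h2_balanced(mean, scv, rng=random):
        """Draw one sample from the balanced-means hyperexponential (H2) distribution
        with the given mean and squared coefficient of variation (SCV > 1),
        parameterized as in (36)-(38)."""
        p = 0.5 * (1.0 + ((scv - 1.0) / (scv + 1.0)) ** 0.5)   # (37)
        lam1 = 2.0 * p / mean                                   # from (38): p/lam1 = mean/2
        lam2 = 2.0 * (1.0 - p) / mean
        lam = lam1 if rng.random() < p else lam2                # choose the exponential phase
        return rng.expovariate(lam)

    # e.g., holding times with mean 1 and SCV 10, as in the H2 examples of Section 4.1
    samples = [sample_h2_balanced(1.0, 10.0) for _ in range(100000)]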
The H2 distribution is a good highly variable distribution to consider for service times because it represents the mixture of two exponential distributions with different means. Such mixtures naturally arise when the customers being considered actually represent the combination of two or more different classes with different characteristics. Hyperexponential distributions are natural to consider for arrival processes too, because they are equivalent to on/off arrival processes, i.e., a Markov modulated Poisson process with a two-state environment: There is an exponential holding time in each environment state; in one environment state there are no arrivals, while in the other environment state arrivals occur according to a Poisson process.

For these particular models it is not difficult to calculate the blocking probability analytically. First, for the M/GI/s/0 model, the blocking probability can be calculated easily from Erlang's formula. Second, for the H2/M/s/0 model and the H2/H2/s/0 model, the blocking probability can be calculated exactly by using continuous-time Markov chains. For s = 100, the number of states needed for the H2/H2/s/0 model is of order 10⁴, which is manageable. However, it is clear that the variance reduction behavior will be similar for other distributions for which it is not possible to compute the blocking probability analytically. We use the analytic results for Poisson arrivals to help validate our results.

In this example, we let each simulation run length be 200,000 time units, which corresponds to an expected number of arrivals equal to 200,000λ (2 × 10⁷ when λ = 100). We use 400 batches and delete an initial period of length 50 to allow the system to approach steady state. Simulation results are displayed in Table 1. In each case we display the natural estimate B̂_N(t) and its estimated standard deviation SD B̂_N(t). We also display the estimated variance ratios Var B̂_N(t)/Var B̂(t) for several alternative estimators B̂(t). In our simulation experiments we actually considered combination and linear control estimators based on B̂_S(t) as well as B̂_N(t), but as in our previous paper we found that B̂_S(t) and B̂_N(t) tend to be interchangeable, so we only report results for B̂_N(t).

As in Srikant and Whitt, we find that the performance of the estimators in the GI/GI/s/0 model depends on the loading. Roughly speaking, the loading can be regarded as light, normal or heavy when λ < s − 2√s, s − 2√s ≤ λ ≤ s + 2√s, or λ > s + 2√s. A starting point is the result from our previous paper that B̂_I(t) is much more efficient than B̂_N(t) in heavy loading, much less efficient in light loading, and about equally efficient in normal loading. Here are the conclusions we draw from Table 1: First, the efficiency of the grand combination estimator B̂_GC(t) and the combination estimator B̂_C(t) in (9) are essentially the same. Thus, we conclude that the combination estimator already includes the benefits from using the controls λ̂(t) and μ̂^{-1}(t). In every case, the combination estimator is at least as efficient as all the other estimators. For each other estimator, there is some case in which the combination estimator is substantially better. As indicated earlier, the variance reduction is dramatic in heavy loading. This is due in part to the advantage of the indirect estimator, but the combination feature also contributes significantly. The variance reduction provided by the combination feature is also substantial in normal loading.
In normal loading the combination improves the indirect estimator more than the indirect estimator improves the natural estimator (but much of the gain would be captured by the indirect estimator plus a linear control). The variance reduction tends to increase as the service time gets more variable. The effect of arrival process variability is less clear. The linear control estimators B̂_LN(t) and B̂_LI(t) consistently offer improvement over the basic estimators B̂_N(t) and B̂_I(t), respectively. In heavy loading, B̂_LI(t) is nearly as good as B̂_C(t), while in light loading B̂_LN(t) is nearly as good as B̂_C(t). In normal loading B̂_C(t) seems to be slightly better than B̂_LN(t) and B̂_LI(t), with B̂_LI(t) being slightly better than B̂_LN(t). A key point is that everything is not captured by the linear controls: The differences between the natural and indirect estimators are not removed by simply using linear controls.

In Table 2 we give variance ratios for the M/M/s/0 model with μ = 1 as a function of λ and s. The intent here is to show the impact of system size as well as loading. With one exception (normal loading, s = 100 to 1000), larger s means larger variance ratios, but the loading is clearly a more important factor. If we hold the blocking probability fixed, then size becomes a clearer factor; then larger size consistently yields larger variance ratios.

In order to validate our results, we performed independent replications. Table 3 displays the sample means and sample standard deviations of key quantities for four cases in Table 1 based on 20 independent replications or runs, each of length t = 10⁴ using 20 batches. (Thus the total simulation time and the length of each batch are the same as before.) In each case, the sample mean is the average of the 20 numbers obtained from the 20 runs, while the sample standard deviation is the estimated standard deviation of the quantity from a single run (not the estimated standard deviation of the sample mean, which would be smaller). Thus, the standard deviation estimates show the variability of the estimates from each run. First, in these cases the exact blocking probabilities can be computed from Erlang's blocking formula. The exact blocking probabilities are B = 0.07570 for λ = 100 and B = 0.30124 for λ = 140. From Table 3 we see that there is no discernible bias in the estimators B̂_N(t) and B̂_C(t). The standard deviations of the estimators B̂_N(t) and B̂_I(t) are also consistent with the predictions in Srikant and Whitt, which justifies our choice of run length. Note that the standard deviation of the blocking probability is about 1% of the estimated value, whereas the standard deviations of the standard deviation estimates are larger (relatively); e.g., for the natural estimator they are about 15%. Similarly, the standard deviations of the estimates p̂, r̂, ρ̂ and R̂ are also larger.

The main conclusions about variance reduction can be validated by comparing the sample means of the estimated standard deviations (of B̂_N(t) and B̂_C(t)) to the sample standard deviations of the means. Table 3 shows that these are close. The sample means of the estimated standard deviation of B̂_C(t) are consistently slightly less than the sample standard deviation of the estimated mean of B̂_C(t), revealing the underestimation of variance that occurs due to estimating p in the same run. Table 3 shows that the average predicted variance reductions in the four cases were 13, 410, 33 and 1634, respectively.
After squaring the ratios of the displayed standard deviations, we see that the corresponding ratios of the sample variances of the means are 10, 367, 23 and 1067, respectively. Thus, the predicted variance reduction from the output of one run is slightly optimistic, but clearly genuine. In order to gain further insight into the effect of estimating the optimal weight p from the same run in which we estimate B̂_N(t) and B̂_I(t), we plot in Figure 1 the variance V(p) as a function of p for 5 different replications of the M/H2/s/0 heavy-loading (λ = 140) example from Tables 1 and 3. The example shows that the estimate p̂ from any one run would yield similar predicted variance reduction in any other run. Figure 1 is consistent with the slight underestimation of variance observed in Table 3.

A major conclusion of our previous paper was that, unlike the blocking probabilities themselves, the statistical precision of the basic estimators B̂_N(t) and B̂_I(t) in the M/GI/s/0 model strongly depends on the holding-time distribution beyond its mean. However, we observed a near insensitivity to the holding-time distribution (beyond the mean) in the standard deviations of the estimators B̂_C(t) and B̂_GC(t) in the M/G/s/0 model. In Section 5 we show that the insensitivity is asymptotically correct as λ → ∞. From statistical analysis of the simulation results, we are able to conclude, with very high probability, that in general full insensitivity does not hold for the standard deviations of the estimators B̂_C(t) and B̂_GC(t), but it is a close approximation. To illustrate, in Table 4 we display the sample means of four estimators based on 10 runs of length 10,000 each for the M/G/s/0 model with two holding-time distributions. The first holding-time distribution is Erlang (E_10) with c²_s = 0.1, while the other is H2 with c²_s = 10.0. As before, we consider heavy loading, normal loading and light loading; i.e., we consider s = 100, μ = 1 and three values of λ: λ = 140, λ = 100 and λ = 80. The estimated standard deviations are quite close for B̂_C(t) and B̂_GC(t), but not for the other two estimators.

4.2. Dependent Interarrival and Service Times

To show that the estimation procedure also applies to more complex models, we also considered the G/G/s/0 model in which the service times and arrival process are allowed to be dependent. Specifically, we let the service time of each customer be exactly λ/μ times the last interarrival time. This gives the service time the correct mean μ^{-1}, but makes these two variables strongly dependent (correlation 1). No extra work was required to implement the various estimators for these modified models. Moreover, essentially the same variance reduction behavior was observed in these modified models as was observed for the standard models. Intuitively, having service times positively correlated with interarrival times should reduce congestion, but our simulation results showed that the blocking probabilities in the modified M/M/s/0 and H2/H2/s/0 systems did not differ too much from the blocking probabilities in the standard M/M/s/0 and H2/H2/s/0 systems. For example, for the modified M/M/s/0 model with s = 2, μ = 1 and λ = 2, the estimated blocking probability was 0.380 compared to 0.400 in the standard M/M/s/0 model; for s = 100, μ = 1 and λ = 140, the estimated blocking probability was 0.3008 compared to 0.3012. The decrease was statistically significant, but not large.
4.3. Loss Networks

To show that the estimation procedures also apply to more elaborate loss networks, as in Ross (1995), we also considered three-link triangle networks. Direct traffic is offered to each link, but if these requests are blocked, then they can be routed on the other links if there is space. We assume that each request uses one circuit, with alternate-routed traffic requiring one circuit on both of the other two links. Alternate-routed calls hold the circuits on both links for the duration of the call. Both circuits become free when the call is complete. We also allow trunk reservation on each link. A trunk reservation parameter tr_i on link i means that alternate-routed traffic is only accepted on that link if there are at least tr_i free circuits on that link. There must be sufficient free capacity on both links in order for a candidate alternate-routed call to be admitted.

We consider examples with independent Poisson call arrival processes and exponential call holding times. For this continuous-time Markov chain model, we used uniformization to construct an associated discrete-time Markov chain with the same steady-state probabilities; e.g., see Keilson (1979). To simulate the full process, we would have to include i.i.d. exponential times between transitions (real or fictitious), but since we only wanted to estimate steady-state quantities, we directly simulated the discrete-time Markov chain. (This step itself serves to reduce variance; see Fox and Glynn (1986).)
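To make the uniformization step concrete, here is a minimal single-link sketch (our illustration under the stated assumptions, not the authors' network code): the M/M/s/0 chain is uniformized with constant Θ = λ + sμ, self-loop transitions play the role of the fictitious events, and the time integral in (4) is replaced by a sum over steps, as described above.

    import random

    def simulate_mms0_uniformized(lam, mu, s, num_steps, rng=random):
        """Simulate the discrete-time Markov chain obtained by uniformizing an
        M/M/s/0 link (a sketch only).  Returns the natural and indirect
        blocking-probability estimates."""
        theta = lam + s * mu          # uniformization constant (bounds the total rate)
        k = 0                         # number of busy servers
        arrivals = losses = 0
        busy_sum = 0                  # replaces the time integral in (4) by a sum
        for _ in range(num_steps):
            busy_sum += k
            u = rng.random() * theta
            if u < lam:               # arrival event
                arrivals += 1
                if k < s:
                    k += 1
                else:
                    losses += 1
            elif u < lam + k * mu:    # departure event
                k -= 1
            # otherwise: fictitious (self-loop) transition
        b_natural = losses / arrivals
        b_indirect = 1.0 - (busy_sum / num_steps) / (lam / mu)
        return b_natural, b_indirect

    # e.g., the heavy-loading single-link case: lam = 140, mu = 1, s = 100
    print(simulate_mms0_uniformized(140.0, 1.0, 100, 2_000_000))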
In the specific examples we now discuss, the three links all have capacity 100 and trunk reservation parameter tr, and the holding times all have mean 1. The model is thus specified by the three arrival rates λ_i and the common trunk reservation parameter tr. We apply the estimation procedures to estimate the blocking probabilities of each class and the overall (total) blocking probability. The results for six cases are displayed in Table 5. These results were obtained from single runs with 10⁷ arrivals after a warmup period of 10⁵ arrivals. Since the simulation is in discrete time, the integral in (4) is replaced by a sum.

In the first two cases the arrival rate is 140 on each link, with the common trunk reservation parameter being 5 in the first case and 0 in the second. If the trunk reservation parameter is high enough, then the example becomes like three separate links in heavy loading. However, the first example with tr = 5 differs noticeably from the M/M/s/0 heavy-loading cases in Tables 1–3. The combination estimator yields significant variance reduction when tr = 5, but not as great as for only one link. However, there is a dramatic change when tr = 0. Evidently, the alternate-routed calls make the occupancy levels for the individual classes much more variable, so that the indirect estimator becomes less efficient. The combination estimator does no worse than the natural estimator, but it only provides significant improvement for the overall blocking probability. This case also shows that the correlation ρ can be positive. (Positivity was confirmed by independent replications.)

The third case in Table 5 is a balanced network with normal loading. In this case, the trunk reservation parameter tr = 5 is sufficiently small that the model is very different from three separate links. Nevertheless, the combination estimator reduces variance by factors of about 4 and 13 for the individual classes and the total network. In this case the variance reduction is primarily due to the combination procedure (R < r²).

The remaining three cases in Table 5 are unbalanced networks. For these cases, the advantage of the combination estimator fluctuates widely. Moreover, it is difficult to predict in advance whether the indirect or natural estimator is better. These examples show that the combination estimator can be good even if it just automatically selects the better of these two basic estimators. Of course, it does this and somewhat better still.

4.4. Finite Waiting Rooms

Our final example involves the addition of a finite waiting room. The addition of a finite waiting room clearly has negligible effect in light loading, but it can have a dramatic impact under heavy loading. To illustrate, we first consider the M/M/s/k model with s = 100, k = 100 and λ = 140. This is the same heavy-loading example considered in Tables 1–3, except that we have added a waiting room of size 100. The waiting room slightly reduces the blocking probability from 0.3012 to 0.2857, but it has an enormous impact on the variance reduction. Since the number of busy servers remains at 100 much more frequently, the variance of the indirect estimator drops dramatically. In several independent runs of length 10⁴, the estimated standard deviation of the indirect estimator was 3 × 10⁻⁶, while the estimated standard deviation of the natural estimator was 1.2 × 10⁻³. This is a variance reduction of 1.6 × 10⁵. In this example, there was not much for the combination estimator to add. It yielded essentially the same estimated mean and standard deviation, and ρ̂ = −0.054. The corresponding example with hyperexponential service times having c²_s = 10 yielded a variance reduction for the indirect and combination estimators of 1.6 × 10⁶. In this case the combination estimator itself provided slight further improvement; the estimated correlation was ρ̂ = −0.192.

With a finite waiting room, the indirect estimator can be much better than the natural estimator even with only a single server. To illustrate, we consider an M/M/1/k model with μ = 1.0 and k = 100. Based on runs of length 10⁷, the variance reductions for the indirect and combination estimators were both 7.8 × 10⁴ when λ = 2.0 and were 17.6 and 25.8, respectively, when λ = 1.1.

5. Heavy Loading Asymptotics

We know that the indirect estimator becomes much more efficient than the natural estimator in heavy loading. The examples have shown that the combination estimator can contribute even more variance reduction in heavy loading. From (24) in Section 2, we know that the additional variance reduction provided by the combination estimator is then (1 − ρ²)^{-1}, where ρ is the correlation between the basic estimators B̂_N(t) and B̂_I(t), because r is small. A natural question, then, is: What is ρ? In this section we identify the limit of ρ as λ → ∞.

We first show that in the G/G/s/0 model the correlation between the indirect estimator and another estimator approaches −1 as λ → ∞. This other estimator is the time-congestion estimator

    B̂_T(t) = t^{-1} ∫_0^t 1_{{N(u)=s}} du.   (39)

The time-congestion estimator was considered in Srikant and Whitt, where it was found to behave similarly to the natural and simple estimators. Since the time-congestion estimator is similar to the natural estimator, the simple analysis in this case supports our intuition in the actual case of interest (with B̂_N(t)). When the arrival rate becomes very large, the system is nearly full all the time. There tends to be only one free server for a short time after each service completion.
Therefore,

    B̂_I(t) = 1 − n̂(t)/α ≈ 1 − ((s − 1) + B̂_T(t))/α.   (40)

Hence we have established our first result.

Theorem 1. In the G/G/s/0 model, lim_{λ→∞} Corr(B̂_I(t), B̂_T(t)) = −1.

We now study ρ (for the natural estimator) in the G/G/s/0 model with μ = 1. We assume that the arrival process is an ergodic stationary point process independent of the service times, which form a stationary sequence. We let λ → ∞ by scaling the interarrival times. We assume that the service times satisfy a functional central limit theorem (FCLT), e.g., see Billingsley (1968) and Whitt (1980). Let ⇒ denote convergence in distribution. Then the assumed FCLT is

    n^{-1/2}(Σ_{i=1}^{⌊nt⌋} S_i − nt) ⇒ √(c²_s) W(t) as n → ∞,   (41)

where {W(t) : t ≥ 0} is a standard (drift 0, variance 1) Brownian motion or Wiener process. This condition is satisfied in the GI case provided that the service-time cdf has finite variance, in which case c²_s is the SCV.

As λ → ∞, the system alternates between s servers busy and (s − 1) servers busy. After each service completion, there is a brief idle period until the next arrival. As λ → ∞, this idle period tends to have the stationary-excess distribution of the interarrival-time distribution. To obtain a meaningful statement, we should consider the system as λ → ∞ with time rescaled so that the arrival rate is 1. Then, if F_a is the interarrival-time cdf with mean 1, the idle-time cdf approaches

    F_ae(t) = ∫_0^t [1 − F_a(u)] du, t ≥ 0,   (42)

which has mean m_2/2, second moment m_3/3 and, thus, SCV

    c²_ae = 4m_3/(3m_2²) − 1,   (43)

where m_k is the kth moment of F_a, with m_1 = 1. This occurs as λ → ∞ because there are then many arrivals between each service completion. This makes the epoch of a service completion fall at an arbitrary time in the stationary point process. We also assume that successive idle times become i.i.d. as λ → ∞, which will occur if the arrival process is only weakly dependent.

Now let {C_s(t) : t ≥ 0} be the counting process of successive busy periods. As λ → ∞, {C_s(t) : t ≥ 0} will be the superposition of s stationary point processes generated by the service times at each server. By (41) and Section 7 of Whitt (1980a) (recall μ = 1),

    n^{-1/2}(C_s(nt) − snt) ⇒ √(sc²_s) W(t) as n → ∞.   (44)

We first consider the natural estimator.

Theorem 2. In the G/G/s/0 model with μ = 1,

    lim_{λ→∞} λ(1 − B̂_N(t)) = t^{-1} C_s(t) w.p.1.   (45)

Proof. As λ → ∞, there is one admitted arrival in each busy cycle, and the busy cycles approach the busy periods. Thus,

    λ(1 − B̂_N(t)) ≈ λ C_s(t)/A(t) → C_s(t)/t as λ → ∞

w.p.1 by the ergodic theorem for the arrival process. □

We now consider the indirect estimator.

Theorem 3. In the G/G/s/0 model with μ = 1,

    lim_{λ→∞} [λ²(1 − B̂_I(t)) − λs] = −t^{-1} Y(t) ≡ −t^{-1} Σ_{i=1}^{C_s(t)} X_i w.p.1,   (46)

where {X_i} is an i.i.d. sequence independent of {C_s(t) : t ≥ 0} with X_i distributed as F_ae in (42).

Proof. First note that λ(1 − B̂_I(t)) = n̂(t). Then note that, for large λ,

    n̂(t) ≈ s − t^{-1} Σ_{i=1}^{D_s(t)} Z_i,

where Z_i is the length of the ith idle period and D_s(t) is the counting process of the busy cycles. Hence, when we rescale by multiplying by λ,

    λ²(1 − B̂_I(t)) − λs ≈ −t^{-1} Σ_{i=1}^{D_s(t)} λZ_i.

As λ → ∞, the variables λZ_i become distributed the same as the variables X_i, {λZ_i} becomes independent of D_s(t) and D_s(t) → C_s(t). □

We now establish a joint CLT for (C_s(t), Y(t)), where Y(t) is as in (46).

Theorem 4. In the G/G/s/0 model with μ = 1, if λ → ∞ and then t → ∞, then

    t^{-1/2}(C_s(t) − st, Y(t) − st) ⇒ (√(sc²_s) N_1(0,1), √(sc²_s) N_1(0,1) + √(sc²_ae) N_2(0,1))   (47)

for c²_s in (41) and c²_ae in (43), where N_1(0,1) and N_2(0,1) are independent standard (mean 0, variance 1) normal random variables.

Proof.
Our starting point is the joint FCLT

    n^{-1/2}(Σ_{i=1}^{⌊nt⌋} X_i − nt, C_s(nt) − snt) ⇒ (√(c²_ae) W_1(t), √(sc²_s) W_2(t)) as n → ∞

in the function space D[0, ∞), where W_1(t) and W_2(t) are independent standard Brownian motions. This initial FCLT holds because of the independence which holds as λ → ∞. Next we apply the continuous mapping theorem with a random time change, as in Section 5 of Whitt (1980a), and the projection at t = 1 to obtain (47); i.e., we consider the process n^{-1/2}(Σ_{i=1}^{⌊nt⌋} X_i − nt) evaluated at the random time C_s(nt)/n and then add the process n^{-1/2}(C_s(nt) − snt) to obtain

    n^{-1/2}(Σ_{i=1}^{C_s(nt)} X_i − snt, C_s(nt) − snt) ⇒ (√(c²_ae) W_1(st) + √(sc²_s) W_2(t), √(sc²_s) W_2(t)) as n → ∞.

Since Y(t) = Σ_{i=1}^{C_s(t)} X_i, we have the desired result. □

The limit in (47) leads to the approximations

    Cov(C_s(t), Y(t)) ≈ stc²_s   (48)

and

    Corr(C_s(t), Y(t)) ≈ √(c²_s/(c²_s + c²_ae))   (49)

assuming that t is suitably large. Moreover, Theorems 2 and 3 lead to the approximation

    Corr(B̂_N(t), B̂_I(t)) ≈ −Corr(C_s(t), Y(t)).   (50)

Combining Theorems 2–4 and (48)–(50), and employing uniform integrability arguments (which we omit), see p. 32 of Billingsley (1968), we obtain the following result.

Theorem 5. In the G/G/s/0 model, assuming uniform integrability of B̂_N(t)² and B̂_I(t)²,

    lim_{t→∞} lim_{λ→∞} Corr(B̂_N(t), B̂_I(t)) = −√(c²_s/(c²_s + c²_ae)).   (51)

It is interesting to see how the limit in (51) behaves in special cases. For an M arrival process, c²_ae = c²_a = 1; the minimum value of c²_ae is 1/3, for a D arrival process. For the M/G/s/0 examples in Table 1, c²_s = 1 and c²_s = 10, so that ρ → −1/√2 ≈ −0.707 and −√(10/11) ≈ −0.953, respectively. These limiting formulas agree remarkably well with the estimates ρ̂ = −0.710 and ρ̂ = −0.937 in the heavy-loading cases of Table 1. Formula (51) seems to provide useful rough approximations even outside the heavy-loading regime, as shown by the normal and light loading cases in Table 1.

It is significant that Theorem 5 is consistent with the approximate insensitivity we observed in Section 4.1 for the combination estimator in the M/G/s/0 model. Combining (17) of Srikant and Whitt with (51) above, we obtain

    lim_{λ→∞} Var B̂_C(t; G/G/s/0)/Var B̂_C(t; M/M/s/0) ≈ [(1 − ρ²_G)/(1 − ρ²_M)] (c²_a + c²_s)/2 = [c²_ae/(c²_ae + c²_s)] (c²_a + c²_s),

where t and μ are fixed. In the case of M arrivals, c²_ae = c²_a = 1, so that the ratio becomes 1, showing asymptotic insensitivity in the M/G/s/0 model.

We can also use Theorems 2–5 to obtain insights about the performance of the linear control estimators in heavy loading. Note that the limit C_s(t)/t in Theorem 2 is essentially determined by the estimator μ̂^{-1}(t). Thus, as λ → ∞, Corr(B̂_N(t), μ̂^{-1}(t)) → 1 and B̂_LN(t) using the control μ̂^{-1}(t) becomes a significant improvement over B̂_N(t). On the other hand, Corr(B̂_N(t), λ̂(t)) → 0, so that λ̂(t) is not an effective control for large λ. Since Corr(B̂_N(t), μ̂^{-1}(t)) → 1 as λ → ∞, we see that B̂_LI(t) using μ̂^{-1}(t) is asymptotically equivalent to B̂_C(t) as λ → ∞, i.e., Corr(B̂_I(t), μ̂^{-1}(t)) has the same limit as in (51), which is consistent with the numerical results in Table 1.

We conclude this section by noting that the uniformization approach for treating Markovian networks in Section 4.3 makes B̂_N(t) and B̂_I(t) much more strongly negatively correlated. With the discrete-time framework, the random variables X_i in Y(t) in (46) are replaced with constants. This enables us to conclude the following.

Theorem 6. In the M/M/s/0 model, if the discrete-time process is simulated using uniformization, then

    lim_{t→∞} lim_{λ→∞} Corr(B̂_N(t), B̂_I(t)) = −1.
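As a small numerical check of (43) and (51) (our sketch, not from the paper), the limiting correlation and the implied factor 1 − ρ² can be computed from the interarrival-time moments and the service-time SCV; the Poisson-arrival values below reproduce the −0.707 and −0.953 figures quoted above.

    def heavy_loading_correlation(m2, m3, scv_service):
        """Limiting correlation of the natural and indirect estimators, (51),
        with the stationary-excess SCV c2_ae computed from (43).

        m2, m3      -- second and third moments of the interarrival-time cdf (mean 1)
        scv_service -- SCV of the service-time distribution (c2_s)
        """
        c2_ae = 4.0 * m3 / (3.0 * m2 ** 2) - 1.0               # (43)
        rho = -(scv_service / (scv_service + c2_ae)) ** 0.5    # (51)
        return rho, 1.0 - rho ** 2                              # correlation and 1 - rho^2

    # Poisson (M) arrivals: exponential interarrivals with mean 1 have m2 = 2, m3 = 6,
    # so c2_ae = 1.  c2_s = 1 gives rho about -0.707; c2_s = 10 gives rho about -0.953.
    print(heavy_loading_correlation(2.0, 6.0, 1.0))
    print(heavy_loading_correlation(2.0, 6.0, 10.0))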
The estimates of were 0:690, 0:804, 0:874 and 0:967 for arrival rates 140, 500, 1000 and 5000, respectively. For the M/M/s/0 with = 1 s = 2 and = 100, we obtained ̂ 0:9993. 6. The Importance of Knowing and The estimators B̂I (t), B̂C (t) and B̂GC(t) all take advantage of our knowledge of and . We would like to achieve similar variance reduction using estimates of and , (i.e., via the modi ed indirect estimator B̂M (t) in (14) and the associated modi ed combination estimator B̂MC (t)) because then the estimators could be used as part of real-time estimation procedures based on actual system measurements (in which case and would have to be estimated too), but unfortunately the good performance of the indirect and combination estimators evidently depends on knowing and . This is essentially the same conclusion reached in Glynn and Whitt (1989) about indirect estimation via L = W . It is important to note that some attempts to achieve e ective variance reduction when we do not know and are mere illusions. In order to estimate the nal variances of our estimators, we consistently work with batch means. Thus the modi ed indirect estimator B̂M (t) in (14) is obtained by taking estimates of n̂i(t) and ̂i(t) within each batch and then forming the average of the ratios n 1Pni=1(n̂i(t)= ̂i(t)). Instead, we could determine the overall average ̂(t) for the entire run and use that in each batch with the batch means of n̂i(t), i.e., n 1Pni=1 n̂i(t)= ̂(t). This alternative approach yields spectacular improvement in the direct sample estimates of the estimator variance in heavy loading, but the observed gain is not genuine. The actual estimates produced by this new version of the modi ed estimator B̂M (t) turn out to be very similar to the estimates from the previous modi ed estimator. The putative decrease in sample variance occurs because we have ignored the strong positive correlation between batches caused by using the common factor ̂(t) in each batch. The lack of variance reduction is con rmed when we estimate the variance by performing independent replications. To illustrate, we give an example. Consider the M/M/s/0 model with = 140, s = 100 and = 1, as in Tables 1{3. Since we are in heavy loading, we know that B̂I(t) will have lower variance than B̂N (t), and we would like to achieve this gain with B̂M (t). In a run of length 10,000, we obtain estimates B̂N (t) = 0:3020, SD(B̂N (t)) = 0:000993, while SD(B̂I (t)) = 0:000073. The two modi ed estimators yielded estimates 0.301992 and 0.302004, and sample standard deviations 0.000994 and 0.000073. So at rst glance, it looks as if we have succeeded with the modi ed estimator using the ̂(t) for the entire run. However, multiple independent replications show that the real standard deviation for both modi ed estimators is actually about SD(B̂N (t)); just as is the case for B̂M (t). The situation is di erent for the natural estimator B̂N (t) in (6). If we know , then we are able to use the simple estimator B̂S (t) in (7) instead of the natural estimator. However, we have found that the role of known and is very di erent in these cases. On the one hand, in our previous paper we found that the estimators B̂S(t) and B̂N (t) are almost identical (both in actual value and in variance), so that they can be used interchangeably with negligible di erence. On the other hand, B̂I(t) and B̂M (t) turn out to be very di erent, so that B̂M (t) fails to capture the advantage of B̂I(t) in heavy loading. Similarly, B̂MC(t) fails to capture the advantages of B̂C (t). 
There is a basis for understanding why these estimators perform as they do in the theory of indirect estimation in Sections 1 and 8 of Glynn and Whitt (1989). There, generic estimators that do not use known parameters are called direct estimators; while the corresponding ones that do are called indirect estimators. The relation between the e ciencies of these estimators is characterized in Theorem 9 of Glynn and Whitt (1989). For completeness, we restate it here, with minor modi cation in notation for our setting. The indirect estimation framework involves estimators (x̂t; ŷt) of (x; y) such that t1=2(x̂t x; ŷt y) ) N (0; C) ; 16 where ) denotes convergence in distribution, x and y are vectors, N (0; C) is a random vector with a multivariate normal distribution having zero means and covariance matrix C. Let x and y be k and l dimensional, respectively. Let C11; C12; C21 and C22 be the k k, k l, l k and l l submatrices of C associated with x and y. Our goal is to estimate f(x; y) for a smooth real-valued function f , i.e., with gradient 5xf(x; y) = @f @x1 (x; y); : : : ; @f @xk (x; y) t and 5yf(x; y) = @f @y1 (x; y); : : : ; @f @yl (x; y) t : The direct estimator is ẑD t f(x̂t; ŷt) and the indirect estimator is ẑI t f(x; ŷt). Let z f(x; y). Let op(t 1=2) refer to a term which converges to 0 in probability after dividing by t 1=2. Here is the result: Theorem 7. (Glynn and Whitt (1989)) Under the assumptions above, (a) t1=2(ẑD t z)) N (0; ̂2 D) as t!1, (b) t1=2(ẑI t z)) N (0; ̂2 I) as t!1, (c) ẑD t = ẑI t + (x̂t x)5x f(x; y) + op(t 1=2) as t!1, where ̂2 I = 5yf(x; y)tC225y f(x; y) and ̂2 D = 5xf(x; y)tC115x f(x; y) +5yf(x; y)tC215x f(x; y) +5x f(x; y)tC125y f(x; y) +5yf(x; y)tC225y f(x; y) : From Theorem 7 we see that the asymptotic e ciency (asymptotic variance) of the estimators ẑD t and ẑI t depends on the gradient 5f(x; y) and the covariance matrix C. We now apply Theorem 7 to our estimators. First, we relate the natural and simple estimators. The natural estimator B̂N (t) is the ratio of the two estimators L(t)=t and ̂(t). Therefore, the connecting function is f(x; y) = y=x where x = and y = B. The associated gradient of f is 5 f(x; y) = ( y=x2; 1=x) = ( B= ; 1= ) : (52) Theorem 7 (c) implies that B̂N (t) = B̂S (t) B 1(̂(t) ) + op(t 1=2) : (53) By assumption, t1=2(̂(t) ) N (0; C11), which is equal in distribution to pC11N (0; 1). When A(t) is a Poisson process, C11 = . Using this as an approximation, we have B̂N (t) B̂S(t) B p tN (0; 1) ; (54) which tends to be small compared to B when t is large. For instance, in the examples in Table 1 we had t 106, so that 1=p t = 10 3 there. Hence, we anticipate that B̂N (t) B̂S (t) when is large. We can also reach this conclusion by focusing on the variances, using Theorem 1(a) and (b), but this essentially repeats what was done in Section 7 of Srikant and Whitt, so we stop. Now we consider indirect estimation in the context of the indirect estimator B̂I(t). Since B = 1 n= for = = , the connecting function is f(x1; x2; y) = 1 (y=x1x2) for x1 = ; x2 = 1 and y = n. The gradient is 5 f(x1; x2; y) = y x21x2 ; y x1x22 ; 1 x1x2 = n 2 1 ; n 2 ; 1 1 : (55) 17 Theorem 1 (a) and (b) imply that V arB̂I(t) = 1 2V ar(n̂(t)) (56) and V arB̂M (t) V arB̂I(t) = n2 2 2V ar(̂(t)) + n2 2 2V ar(̂ 1(t)) + 2n2 3 Cov(̂(t); ̂ 1(t)) 2n 2Cov(̂(t); n̂(t)) 2n 1 2Cov(̂ 1(t); n̂(t)) : (57) Assuming that the arrival process and holding times are mutually independent, we expect the covariance term Cov(̂(t); ̂ 1(t)) in (27) to be negligible. 
Assuming that = 1 and n, we see that the prefactors of all terms not involving ̂ 1(t) are about the same size, namely, 1= 2. However, the prefactors of terms involving ̂ 1(t) are larger, showing the potential for the service times to have a greater in uence. The case of primary interest, yielding the big advantage for B̂I (t) and B̂C (t), is heavy loading. In heavy loading we will have n̂(t) s and V ar(n̂(t)) much reduced compared to V ar(̂(t)) and V ar(̂ 1(t)). Similarly, covariance terms involving n̂(t) should be negligible. Thus, we can apply (57) to deduce that V arB̂I (t) should be much smaller than V arB̂M (t), as is borne out by experiments. 7. Deterministic Service Times We can obtain additional insight into the variance reduction by considering the special case of the G/D/s/0 model, which has deterministic service times. Without loss of generality, let the service times all be of length 1. In general, we have the basic conservation law N (t) = N (0) +A(t) D(t) L(t) ; (58) where D(t) records the number of departures in [0; t], i.e., the number of admitted arrivals that have completed service. For simplicity, assume that N (0) = 0. Then, since we have deterministic service times, D(t) = A(t 1) L(t 1) : (59) Combining (58) and (59), we obtain N (t) = [A(t) A(t 1)] [L(t) L(t 1)] (60) and n̂(t) 1t Z t 0 N (u)du = 1t Z t t 1[A(u) L(u)]du : (61) since A(t) and L(t) increase as t increases, (61) implies that n̂(t) A(t) t L(t) t : (62) We rst apply (62) to analyze the modi ed estimator. In our context, assuming (62), B̂M (t) = 1 tn̂(t) A(t) = L(t) A(t) = B̂N (t) ; (63) so that we can see why B̂M (t) tends to have approximately the same variance as B̂N (t) in general. In the case of deterministic service times, equation (57) also simpli es because ̂ 1(t) is essentially constant. Equation (57) becomes V arB̂M (t) = V arB̂I(t) + n2 4 V ar(̂(t)) 2n 3Cov(̂(t) ; n̂(t)) = 1 2V ar n̂(t) n̂(t)! : (64) 18 If in addition = n, then (62) and (64) imply that V arB̂M (t) 2V arL(t) t V arB̂S (t) ; (65) which further connects B̂S (t); B̂N (t) and B̂M (t). From (62), we can also characterize the variance of the simple estimator in heavy loading. Assuming that n̂(t) n, B̂S (t) = L(t) t (A(t) t n so that V arB̂S(t) = V ar̂(t) 2 : If fA(t) : t 0g is a Poisson process, then V arB̂S (t) t= . In contrast, V arB̂I(t) = V ar n̂(t) 2 : Since ̂(t) will be more variable than n̂(t) in heavy loading, we see the advantage of the indirect estimator. 8. Correlation Inequalities In Section 5 we identi ed the limiting correlation between B̂N (t) and B̂I (t) as the load increases. In this section we establish qualitative results for all loadings. We provide theoretical evidence showing that the estimators B̂N (t); n̂(t); B̂I(t) and ̂(t) are indeed all positively correlated in a large class of loss models (for any loading), which is consistent with intuition. (Unfortunately we are unable to treat ̂ 1(t).) In order to avoid having to treat ratios of random variables, we consider the estimator B̂S(t) = L(t)= t in (7) instead of B̂N (t). As indicated earlier, B̂S (t) and B̂N (t) are very similar. The speci c class of models we consider here we denote by DFR/IFR/s/0; it is the special case of the general GI/GI/s/0 model in which the interarrival-time distribution is DFR (has decreasing failure rate) and the service-time distribution is IFR (has increasing failure rate). 
If F (t) is the cumulative distribution function with density f(t), then the failure rate is r(t) = f(t)=(1 F (t)) : (66) The DFR (IFR) property means that r(t) is a decreasing (increasing) function; see Barlow and Proschan (1975). The DFR class includes the hyperexponential (Hk, mixture of k exponentials) distribution, while the IFR class includes the Erlang (Ek, convolution of k identical exponentials) distribution. Both include the exponential distribution, so that the M/M/s/0 (Erlang) model is covered. However, the examples with H2 service times in Section 3 are not included. Here is our main correlation inequality result. Theorem 8. In the DFR/IFR/s/0 model, the estimators B̂S (t); n̂(t), ̂(t) and B̂I (t) are all positively correlated. We prove Theorem 8 by representing the DFR/IFR/s/0 model as a limit of discrete-time models, and by establishing a related result for discrete-time models. Theorem 1 of Whitt (1980b) can serve as the connecting continuity theorem. Related continuity results appear in Kalashnikov and Rachev (1990). We now consider the discrete-time loss model. As before, there are s servers with no extra waiting room. We let arrivals occur before services in each period, so that blocking is determined before processing any service times. (In a limiting continuous-time model, usually two events never occur at the same time.) Let ak be the potential number of arrivals in period k. (This could be only 1 or 0, but we allow other possibilities.) Let ckj be the potential service completion indicator variable for server j in period k; i.e., ckj = 1 if the jth customer has a potential service completion and ckj = 0 otherwise; if server j is busy and if ckj = 1, then there is a service completion. (Each server serves at most one customer in each period. We allow ckj = 1 when the server is idle; then there is no actual service completion.) 19 Let xkj be the jth server occupancy indicator variable for period k; i.e., xkj = 1 if the jth server is busy after both arrivals and service completions in period k, and xkj = 0 otherwise. Let ykj be the jth server pre-service occupancy indicator variable for period k; i.e., ykj = 1 if the jth server is busy after the arrivals but before the service completions in period k. (Our approach allows heterogeneous servers.) We can describe the evolution of the system recursively through the equations xkj = 1 if ykj = 1 and ckj = 0 ; (67) otherwise xkj = 0; andykj = 1 if xk 1;j = 1 or if xk 1;j = 0 and j Pj 1 i=1 xk 1;i < ak ; (68) otherwise ykj = 0. Let nk be the number of busy servers and let bk be the number of blocked arrivals in period k. These are de ned by nk = s Xj=1 xkj ; k 0 ; (69) and bk = [ak + nk 1 s]+; k 1 ; (70) where [x]+ = maxfx; 0g. We start our treatment of the discrete-time model with the following elementary monotonicity result. See Berger and Whitt (1992) and references there for previous related work. Theorem 9. For K 1, the variables xKj ; yKj; nK and bK are nondecreasing functions of the vector IK (x0j; ak; ckj; 1 j s; 1 k K). In the most elementary setting, which includes the discrete-time analog of the M/M/s/0 model as a special case, we can regard the sequence IK fx0j; ak; ckj; 1 j s; 1 k Kg as being composed of mutually independent random variables speci ed exogeneously. More generally, these variables will be dependent. 
Recall that a collection of random variables fXi : i 2 Ig is associated if the covariances satisfy Cov(f(fXi : i 2 Ig); g(fXi : i 2 Ig)) 0 (71) for all nondecreasing real-valued functions f and g for which the covariance is well de ned; e.g., see p. 29 of Barlow and Proschan (1975), p. 224 of Baccelli and Brem aud (1994) or p. 230 of Glasserman and Yao (1994). The following result is an immediate consequence of basic properties of associated random variables. It closely parallels Theorem 8 of Glynn and Whitt (1989). Theorem 10. If IK fx0j; ak; ckj; 1 j s; 1 k Kg is an associated set of random variables, then so is fIK ; xkj; ykj(1 j s); nk; bk; 1 k Kg. The obvious su cient condition for the condition of Theorem 10 is for the random variables in IK to be mutually independent. This covers the discrete-time analog of the M/M/s/0 model. We now de ne the discrete-time statistical estimators. Let b̂K = K 1 K Xk=1 bk (72) n̂K = K 1 K Xk=1nk (73) âK = K 1 K Xk=1ak (74) Corollary. Under the condition of Theorem 10, b̂K; n̂K and âK are associated. The corollary implies that ̂(t); B̂S(t) and B̂I (t) in the M/M/s/0 model are associated and thus are positively correlated. The M/M/s/0 result requires representing the M/M/s/0 models as a limit of a sequence of the discrete-time models. 20 We now want to treat models in which the variables in IK are not mutually independent. It is natural to de ne the arrival variables ak exogeneously. A natural su cient condition for fa1; : : : ; akg to be associated is for the variables ak to be conditionally increasing in sequence, i.e., for E[f(ak)ja1; : : : ; ak 1] to be nondecreasing in (a1; : : : ; ak 1) for all nondecreasing f and all k, 2 k K; see Theorem 4.7 on p. 146 of Barlow and Proschan. Suppose that ak are binary variables with the intervals between arrivals being i.i.d. Then the arrival process is a discrete-time renewal process. Let pn be the probability that there are n periods between arrivals and let pcn = 1 (p1 + : : :+ pn 1); i.e., pcn is the associated tail probability. It is easy to see that fakg is conditionally increasing in sequence if and only if fpng is DFR (has decreasing failure rate), i.e., if pn pcn pn+1 pcn+1 (75) because P (an = 1jan 1 = an 2 = an k+1 = 0; an k = 1; aj; j n k 1) = pk=pck : (76) We now want to consider dependent service completion variables ckj. For this purpose, let wkj be the server-j prolonged-service indicator variable de ned by wkj = 1 if ykj = 1 and ckj = 0 (77) with wkj = 0 otherwise. If wkj = 0 and wk+1;j = : : : = wk+m;j = 1, then there is a customer in service at server j in period k +m who has been in service for m periods. Note that wkj is increasing in (ykj ; ckj), so that Theorem 10 remains valid if we include the additional variables wkj; 1 j s; 1 k K. Now note that all the variables can be de ned recursively. The order can be x01; x02; : : : ; x0s, n0; a1; y11; y12; : : : y1s, c11; c12; : : : ; c1s; w11; w12; : : : ; w1s, x11; x12; : : : ; x1s; n1; b1; a2; y21; y22; : : : ; y2s, c21; : : : ; c2s; w21; w22; : : : ; w2s; x21; : : : ; x2s; n2; b2, etc. Let H( ) be the history to just before the symbol . For independent service processes, it is reasonable to assume that we have the following conditional distribution property (ckjjH(ckj )) = (ckjjclj; wlj ; 1 l k 1)) w.p.1 : (78) If the arrival process is exogeneously speci ed, then (akjH(ak )) = (akja1; : : : ; ak 1) w.p.1 (79) which is what we considered above. The general theorem is as follows Theorem 11. 
Suppose that (akjHk(ak )) and ( ckjjHk(ckj )) are conditionally increasing in sequence, then the conclusion of Theorem 10 holds. Proof. Since the other variables de ned in (67), (68), (69), (70) and (77) are all monotone in preceding ak and ckj variables, the condition implies that the variables are conditionally increasing in sequence. Corollary. If (78) and (79) hold and if (ckjjc`j; w`j; 1 j k 1) and (akja1; : : : ; ak 1) are conditionally increasing, then the assumption of Theorem 11 holds, so that the conclusion of Theorem 10 holds. If the service times at each server are i.i.d. then (ckjjc`j; w`j; 1 ` k 1) = (ckjjw`j; 1 ` k 1) : Moreover, paralleling the discussion of (76), ( ckjjw`j; 1 ` k 1) will be conditionally increasing if and only if the service times are IFR (have increasing failure rate). The DFR and IFR results above imply that the estimators are positively correlated in the DFR/IFR/s/0 model. Given DFR interarrival times, and IFR service times, we can represent the model as a limit of discrete-time models where the interarrival times are DFR and the service times are IFR. Hence, we have proved Theorem 8. 9. Summary In this paper we have proposed a new estimator for loss models, a combination of the natural and indirect estimators in (6) and (8). In this combination the simple estimator in (7) can be substituted for the natural 21 estimator, yielding very similar performance. (See Section 6 for supporting theory.) The combination is aconvex combination as in (9) in which the optimal weight p depends on the variances and covariance of thetwo component estimators, as described in (19). We have estimated p using batch means from one run,as indicated in (32). We showed that using the same run causes minor underestimation of variances (seeTable 3). This underestimation could be avoided, if deemed important, by estimating p in a separate pilotrun.In our previous paper we showed that the indirect estimator is much more (less) e cient than the naturaland simple estimators in heavy (light) loading. Here we observed that this same property holds, with evenmore di erence in heavy loading, when there is a nite waiting room.In Section 2 we analyzed the bene t of a combination estimator in general, showing that the variancereduction factor is about (12) 1 when the two variances are very unequal. Examples in Section 4 andtheoretical results in Sections 5 and 7 show that tends to be quite strongly negative, especially underheavy loading, so that the combination estimator provides signi cant variance reduction over the indirectestimator. In Section 5 we proved for the G/G/s/0 model that the correlation approachespc2s=(c2s +c2ae)as the arrival rate increases, wherec2s andc2ae are given in (41) and (43).Even in normal loading, the combination estimator can yield variance reduction because the two com-ponent estimators tend to be negatively correlated. In Section 8 we established correlation inequalities fora large class of models to provide theoretical support for this conclusion. These analytical results do notnearly apply to all models for which the estimation procedure can be applied, but they serve as useful the-oretical reference points. The examples in Section 4 show that the correlation is usually negative. (Thebalanced heavily loaded network without trunk reservation in Section 4.3 is a counterexample to a moregeneral result.)Finally, in Sections 6 and 7 we showed that the variance reduction achieved by the indirect and combina-tion estimators depends upon knowing the parameters and . 
Thus the variance reduction technique tendsnot to be directly applicable to system measurements in which and need to be estimated. Overall, thepaper continues the longstanding tradition in the simulation literature of showing that, with some thought,simulations can be conducted more e ciently and e ectively.22 ReferencesBaccelli, F. and P. Brem aud. 1994. Elements of Queueing Theory, Springer, New York.Barlow, R. E. and F. Proschan. 1975. Statistical Theory of Reliability and Life Testing in ProbabilityModels, Holt, Rinehart and Winston, New York.Berger, A. W. and W. Whitt. 1992. Comparisons of multi-server queues with nite waiting rooms.Stoch. Models 8, 719{732.Billingsley, P. 1968. Convergence of Probability Measures, Wiley, New York.Bischak, D. P., W. D. Kelton and S. M. Pollock. 1993. Weighted batch means for con denceintervals in steady-state simulations. Management Sci. 39, 1002{1019.Bratley, P., B. L. Fox and L. E. Schrage. 1987. A Guide to Simulation, second ed., Springer-Verlag,New York.Carson, J. S. and A. M. Law. 1980. Conservation equations and variance reduction in queueingsimulations. Opns. Res. 28, 535{546.Eick, S. G., W. A. Massey and W. Whitt. 1993. the physics of the Mt=G=1 queue. Opns. Res. 41,731{742.Fleming, P. J., D. Schaeffer and B. Simon. 1995. E cient Monte-Carlo simulation of a product-formmodel for a cellular system with dynamic resource sharing. ACM Trans. Modeling Comp. Sim. 5,3{21.Fox, B. L. and P. W. Glynn. 1986. Discrete-time conversion for simulating semi-Markov processes.Opns. Res. Letters 5, 191{196.Glasserman, P. and D. D. Yao. 1994. Monotone Structure in Discrete-Event Systems, Wiley, NewYork.Glynn, P. W. and W. Whitt. 1989. Indirect estimation via L = W . Opns. Res. 37, 82{103.Glynn, P. W. and W. Whitt. 1991. Estimating the asymptotic variance with batch mean. Opns. Res.Letters 10, 431-435.Kalashnikov, V. V. and S. T. Rachev. 1990. Mathematical Methods for Construction of QueueingModels Wadsworth and Brooks/Cole, Paci c Grove, CA.Keilson, J. 1979. Markov Chain Models { Rarity and Exponentiality, Springer-Verlag, New York.Lavenberg, S. S., T. L. Moeller and P. D. Welch. 1982. Statistical results on control variableswith application to queueing network simulations. Opns. Res. 30, 182{202.Law, A. M. 1975. E cient estimators for simulated queueing systems. Management Sci. 22, 30{41.Massey, W. A. and W. Whitt. 1993. Networks of in nite-server queues with nonstationary Poissoninput. Queueing Systems 13, 183{250.Meketon, M. S. and B. Schmeiser. 1984. Overlapping batch means: something for nothing? Proc.1984 Winter Simulation Conference, 227{230.Ross, K. W. 1995. Multiservice Loss Models for Broadband Telecommunication Networks, Prentice-Hall,Englewood Cli s, NJ.Srikant, R. and W. Whitt. 1996. Simulation run lengths to estimate blocking probabilities. ACMTrans. Modeling Comp. Sim. 6, 7{52.Whitt, W. 1980a. Some useful functions for functional limit theorems. Math. Opns. Res. 5, 67{85. Whitt, W. 1980b. Continuity of generalized semi-Markov processes. Math. Opns. Res. 5, 494{501.Whitt, W. 1989. Planning queueing simulations. Management Sci. 35, 1341{1366.Whitt, W. 1991. A review of L = W and extensions. Queueing Systems 9, 235{268.Whitt, W. 1992a. Correction note on L = W . Queueing Systems 12, 431{432.Whitt, W. 1992b. Asymptotic formulas for Markov processes with applications to simulation. Opns. Res.40, 279{291.Wolff, R. W. 1989. Stochastic Modeling and the Theory of Queues, Prentice Hall, Englewood Cli s, NJ. 
heavy loading: = 140thec2a111010casesc2s110110estimated BN (t) 0.30100.30120.34680.3404SD BN (t) 0.000180.000520.00031 0.00053variance ratiosLN 1474124.130.7I 1142097.717.8LI 230160623.3100.8C 253188523.3100.8GC 253188524.1104.7correlation0.7100.9370.6810.847normal loading: = 100thec2a111010casesc2s110110estimated BN (t) 0.07440.07510.16090.1411SD BN (t) 0.000210.000430.00037 0.00056variance ratiosLN 11.515.16.314.7I2.52.61.31.9LI 11.521.66.517.7C 12.328.47.120.4GC 12.328.47.221.4correlation0.7270.8780.6820.863light loading: = 80thec2a111010casesc2s110110estimated BN (t) 0.003940.004030.05870.0402SD BN (t) 0.000460.000091 0.00025 0.00037variance ratiosLN1.392.22.163.9I0.0210.0150.0200.22LI1.090.3281.990.26C1.392.32.125.7GC1.392.32.186.0correlation0.4080.6950.4820.807Table 1. Simulation estimates for the GI/GI/s/0 model with s = 100 and = 1 using exponential (SCV = 1)and hyperexponential (SCV = 10) distributions, based on simulation runs for t = 2 105 (which correspondsto an expected number of arrivals equal to 2 105) using 400 batches. heavy loadingthe = 20= 140= 1200case s = 10s = 100s = 1000B̂N (t).5375.3010.1719SD B̂N (t).000424.00106.00118variance ratiosLN16.094I19.91834LI45.34121539C48.34842049GC48.35042049normal loadingthe = 10= 100= 1000case s = 10s = 100s = 1000B̂N (T ).2144.0751.02338SD B̂N (t).000636.00186.000984variance ratiosLN5.319.910.2I2.53.33.0LI6.623.313.6C7.627.117.1GC7.632.918.1light loadingthe= 5= 80= 930case s = 10s = 100s = 1000B̂N (t).01817.00394.001276SD B̂N (t).000226.000221.000251variance ratiosLN1.131.91.7I0.0160.0350.027LI0.930.280.14C1.131.92.0GC1.132.12.0Table 2. Variance ratios for the M/M/s/0 model with = 1 as a function of and s. The simulationrun length is 106=s with 20 batches in each case (corresponding to an expected number of arrivals equal to( =s)106). M serviceH2 service= 100= 140= 100= 140B̂N (t) mean 0.07588 0.301100.075630.30088SD 0.00085 0.001150.001890.00196B̂C (t) mean 0.07572 0.301260.075680.30125SD 0.00027 0.000061 0.000390.000060SD B̂N (t) mean 0.00086 0.001060.001990.00208SD 0.00012 0.000170.000310.00031SD B̂C (t) mean 0.00023 0.000057 0.000360.000054SD 0.00007 0.000008 0.0000530.000006p̂ mean 0.3630.06240.3770.0645SD 0.0390.00900.0250.0049r̂ mean 0.6260.0880.6280.0743SD 0.0950.01280.0630.0053̂ mean 0.6970.7280.8760.917SD 0.1080.1200.0530.041R̂ mean 0.2270.4040.2990.362SD 0.0780.1530.0680.082Var. Red mean 13.2410.033.01634.0SD 5.89134.014.0904.0min 5.6120.014.8521.0max 20.0618.069.64896.0Table 3. Sample means and standard deviations of estimates for the M/G/s/0 model with s = 100 and = 1based on 20 independent replications of runs each with 106 arrivals and 20 batches. holding-time variabilityloading estimatorc2s = 0:1c2s = 10:0N.000583 .002009heavyI.000054 .000140= 140C.000040 .000048GC.000040 .000047N.000691 .001929normalI.000459 .001225= 100C.000289 .000330GC.000228 .000321N.000190 .000364lightI.001030 .003340= 80C.000165 .000230GC.000161 .000223Table 4. Average standard deviation estimates for four estimators in the M/G/s/0 model for two di erentholding-time distributions with s = 100, = 1 and three values of : = 140, = 100 and = 80, basedon 10 independent replications, each of length 10,000 time units. trk. 
res.50555101 140.0140.0100130.0200.0140.0i 2 140.0140.010090.040.080.03 140.0140.0100110.040.0120.01 0.30190.38930.07640.23080.25530.2959B̂N (t) 2 0.30220.38920.07580.04200.001450.004753 0.30290.38920.07670.15090.001460.1998total 0.30230.38930.07630.15270.18270.19351 0.00050 0.00047 0.00042 0.00044 0.000490.00038SD B̂N (t) 2 0.00043 0.00036 0.00043 0.00030 0.000038 0.000163 0.00053 0.00041 0.00044 0.00043 0.000040 0.00062total 0.00025 0.00028 0.00030 0.00028 0.000360.000291 0.1850.8610.9470.5260.2370.284r 2 0.1951.5400.8951.60620.385.4643 0.1610.9240.9260.70026.260.266total 0.1300.3150.5430.4900.2880.7271 0.2760.4580.5300.2050.3730.5052 0.5310.5880.4930.5510.0640.4663 0.4070.3160.5540.3730.2210.501total 0.6440.5360.6950.4420.2030.0351 0.8130.8290.2480.6410.9790.545R 2 0.5770.4190.2820.1300.00240.0223 0.7210.7090.2410.4280.00130.560total 0.4930.9360.2520.4810.7990.6321 0.0750.3640.4820.2580.0370.164p 2 0.1130.9390.4630.6481.00070.9013 0.0790.4420.4750.3730.99030.153total 0.0850.0920.0770.2730.1180.351overall 1 36.01.64.55.618.222.7variance 2 45.61.04.43.01.01.5reduction 3 53.61.64.84.81.125.2factor total 120.010.813.58.715.13.0Table 5. Simulation results for six examples of three-link triangle networks with alternate routing. Thecapacity of each link is 100 and all mean service times are 1. All runs have 107 arrivals with 400 batchesafter a warmup of 105 arrivals. 050010001500200025003000 0.03 0.035 0.04 0.045 0.05 0.055 0.06 0.065 0.07 0.075 0.08"pvvar1""pvvar2""pvvar3""pvvar4""pvvar5"hevarianceV(p)isafunctionofpinependentreplicationsoftheM=H2 =s=0examplewith=140,100.Eachreplicationwasoflength10 4(about1:410 6arrivals).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

1 Importance Sampling and Stratification for Value - at - Risk

This paper proposes and evaluates variance reduction techniques for efficient estimation of portfolio loss probabilities using Monte Carlo simulation. Precise estimation of loss probabilities is essential to calculating value-at-risk, which is simply a percentile of the loss distribution. The methods we develop build on delta-gamma approximations to changes in portfolio value. The simplest way ...

متن کامل

A Novel Analysis of Risk Sharing Effects on Income Inequality in Informal Insurances

T his study aims to demonstrate that joining in risk sharing network leads to the reduction in incomes volatility. In this respect, income variance for a group of members in an informal insurance is modelled in which income variance prior to joining risk sharing network and after joining is analyzed statistically. A Monte Carlo simulation technique is used to prove the result. To extend an...

متن کامل

Variance Reduction Techniques for Estimating Value-at-Risk

T paper describes, analyzes and evaluates an algorithm for estimating portfolio loss probabilities using Monte Carlo simulation. Obtaining accurate estimates of such loss probabilities is essential to calculating value-at-risk, which is a quantile of the loss distribution. The method employs a quadratic (’’delta-gamma’’) approximation to the change in portfolio value to guide the selection of e...

متن کامل

Active Learning with Scarcely Labeled Data via Bias Variance Reduction

In many occasions in real life, we are faced with the problem of classification of partially labeled data, or semi-supervised learning. We consider the special case of scarcely labeled data or when the labeled data is insufficient, and present a principled method which implements active learning in scarcely labeled data to enhance the performance of the learner. This method is based on the rece...

متن کامل

Importance Sampling and Strati cation for Value - at - Risk

LIMITED DISTRIBUTION NOTICE: This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and sp...

متن کامل

Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions. The present paper proposes a Riemannian stochastic quasi-Newton algorithm with variance reduction (R-SQN-VR). The key challenges of averaging, adding, and subtracting multiple gradients are addressed with notions of retraction and vector transport. We...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996